AITopics | hjb equation

Collaborating Authors

hjb equation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Reinforcement Learning with Non-Exponential Discounting

Neural Information Processing Systems | Apr-24-2026, 20:15:47 GMT

Commonly in reinforcement learning (RL), rewards are discounted over time using an exponential function to model time preference, thereby bounding the expected long-term reward. In contrast, in economics and psychology, it has been shown that humans often adopt a hyperbolic discounting scheme, which is optimal when a specific task termination time distribution is assumed. In this work, we propose a theory for continuous-time model-based reinforcement learning generalized to arbitrary discount functions. This formulation covers the case in which there is a non-exponential random termination time. We derive a Hamilton-Jacobi-Bellman (HJB) equation characterizing the optimal policy and describe how it can be solved using a collocation method, which uses deep learning for function approximation. Further, we show how the inverse RL problem can be approached, in which one tries to recover properties of the discount function given decision data. We validate the applicability of our proposed approach on two simulated problems. Our approach opens the way for the analysis of human discounting in sequential decision-making tasks.
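The link between a termination-time distribution and a hyperbolic discount function can be checked numerically. The sketch below is not taken from the paper: it assumes one standard construction in which the task ends at a constant hazard rate that is itself unknown and exponentially distributed with mean `k`. The function names and the midpoint-rule integration are illustrative choices.

```python
import math


def exponential_discount(t: float, rate: float) -> float:
    """Standard exponential discount factor exp(-rate * t)."""
    return math.exp(-rate * t)


def hyperbolic_discount(t: float, k: float) -> float:
    """Hyperbolic discount factor 1 / (1 + k * t)."""
    return 1.0 / (1.0 + k * t)


def expected_exponential_discount(t: float, k: float, n: int = 20_000) -> float:
    """Average exp(-lam * t) over lam ~ Exponential(mean=k) with a midpoint rule.

    Assumption (not from the source): the hazard rate lam has density
    (1 / k) * exp(-lam / k). The integral is truncated at lam = 60 * k,
    where the density is negligible.
    """
    h = 60.0 * k / n
    total = 0.0
    for i in range(n):
        lam = (i + 0.5) * h
        total += math.exp(-lam * t) * math.exp(-lam / k) / k * h
    return total


if __name__ == "__main__":
    for t in (0.0, 1.0, 5.0):
        print(t, expected_exponential_discount(t, 0.5), hyperbolic_discount(t, 0.5))
```

Under this assumption the averaged exponential discount matches the hyperbolic curve, which decays much more slowly than a single exponential at long horizons.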

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: Europe (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Is $L^2$ Physics-Informed Loss Always Suitable for Training Physics-Informed Neural Network?

Neural Information Processing Systems | Feb-8-2026, 07:45:55 GMT

In particular, we leverage the concept of stability in the literature of partial differential equation to study the asymptotic behavior of the learned solution as the loss approaches zero. With this concept, we study an important class of high-dimensional non-linear PDEs in optimal control, the Hamilton-Jacobi-Bellman (HJB) Equation, and prove that for general $L^p$ Physics-Informed Loss, a wide class of HJB equation is stable only if $p$ is sufficiently large.

artificial intelligence, equation, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

Is $L^2$ Physics-Informed Loss Always Suitable for Training Physics-Informed Neural Network?

Neural Information Processing Systems | Dec-24-2025, 00:48:57 GMT

The Physics-Informed Neural Network (PINN) approach is a new and promising way to solve partial differential equations using deep learning. The $L^2$ Physics-Informed Loss is the de-facto standard in training Physics-Informed Neural Networks. In this paper, we challenge this common practice by investigating the relationship between the loss function and the approximation quality of the learned solution. In particular, we leverage the concept of stability in the literature of partial differential equation to study the asymptotic behavior of the learned solution as the loss approaches zero. With this concept, we study an important class of high-dimensional non-linear PDEs in optimal control, the Hamilton-Jacobi-Bellman (HJB) Equation, and prove that for general $L^p$ Physics-Informed Loss, a wide class of HJB equation is stable only if $p$ is sufficiently large. Therefore, the commonly used $L^2$ loss is not suitable for training PINN on those equations, while $L^{\infty}$ loss is a better choice. Based on the theoretical insight, we develop a novel PINN training algorithm to minimize the $L^{\infty}$ loss for HJB equations which is in a similar spirit to adversarial training. The effectiveness of the proposed algorithm is empirically demonstrated through experiments.
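A small numerical illustration may help show why the choice of $p$ matters. The sketch below only compares discrete $L^p$ norms of a made-up residual vector; it is neither the paper's stability argument nor its training algorithm, and the residual values and function names are assumptions chosen for illustration.

```python
def lp_loss(residuals, p):
    """Sample-normalized discrete L^p norm of PDE residuals."""
    n = len(residuals)
    return (sum(abs(r) ** p for r in residuals) / n) ** (1.0 / p)


def linf_loss(residuals):
    """Discrete L^infinity norm: the largest absolute residual."""
    return max(abs(r) for r in residuals)


# Hypothetical residuals: tiny at almost every collocation point,
# with one narrow spike where the PDE is badly violated.
residuals = [1e-3] * 9999 + [1.0]

if __name__ == "__main__":
    for p in (2, 8, 32):
        print(f"L^{p} loss:", lp_loss(residuals, p))
    print("L^inf loss:", linf_loss(residuals))
```

The $L^2$ value stays small even though the residual is of order one at a single point, while larger $p$ moves the loss toward the worst-case residual that the $L^{\infty}$ loss measures directly.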

equation, neural network, training physics informed neural network, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

DeepPAAC: A New Deep Galerkin Method for Principal-Agent Problems

Ludkovski, Michael, Xie, Changgen, Zhu, Zimu

arXiv.org Artificial Intelligence | Dec-9-2025

We consider numerical resolution of principal-agent (PA) problems in continuous time. We formulate a generic PA model with continuous and lump payments and a multi-dimensional strategy of the agent. To tackle the resulting Hamilton-Jacobi-Bellman equation with an implicit Hamiltonian we develop a novel deep learning method: the Deep Principal-Agent Actor Critic (DeepPAAC) algorithm. DeepPAAC is able to handle multi-dimensional states and controls, as well as constraints. We investigate the role of the neural network architecture, training designs, loss functions, etc. on the convergence of the solver, presenting five different case studies.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.04309

Country:

Asia > China (0.46)
North America > United States > California (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Temporal Difference Method for Stochastic Continuous Dynamics

Settai, Haruki, Takeishi, Naoya, Yairi, Takehisa

arXiv.org Artificial Intelligence | Oct-28-2025

For continuous systems modeled by dynamical equations such as ODEs and SDEs, Bellman's Principle of Optimality takes the form of the Hamilton-Jacobi-Bellman (HJB) equation, which provides the theoretical target of reinforcement learning (RL). Although recent advances in RL successfully leverage this formulation, the existing methods typically assume the underlying dynamics are known a priori because they need explicit access to the coefficient functions of dynamical equations to update the value function following the HJB equation. We address this inherent limitation of HJB-based RL; we propose a model-free approach still targeting the HJB equation and propose the corresponding temporal difference method. We establish exponential convergence of the idealized continuous-time dynamics and empirically demonstrate its potential advantages over transition-kernel-based formulations. The proposed formulation paves the way toward bridging stochastic control and model-free reinforcement learning.
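For reference, the HJB equation mentioned in this abstract can be written out for a discounted infinite-horizon problem. The notation below (drift $f$, diffusion $\sigma$, reward $r$, discount rate $\rho$) is an assumed textbook setting and not the paper's own formulation.

```latex
% Assumed setting (notation not from the source):
% controlled SDE, discount rate \rho > 0, running reward r.
\begin{aligned}
dX_t &= f(X_t, a_t)\,dt + \sigma(X_t, a_t)\,dW_t, \\
V(x) &= \sup_{a}\; \mathbb{E}\!\left[\int_0^\infty e^{-\rho t}\, r(X_t, a_t)\,dt \;\middle|\; X_0 = x\right], \\
\rho\, V(x) &= \sup_{a}\left\{ r(x,a) + f(x,a)^{\top}\nabla V(x)
  + \tfrac{1}{2}\operatorname{tr}\!\big(\sigma(x,a)\,\sigma(x,a)^{\top}\nabla^2 V(x)\big) \right\}.
\end{aligned}
```

In the second line the supremum runs over admissible control processes, and in the third over action values at the state $x$. The coefficient functions $f$ and $\sigma$ appear explicitly in the last line, which is why updating $V$ directly from the HJB equation requires knowing the dynamics.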

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2505.15544

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Ensemble based Closed-Loop Optimal Control using Physics-Informed Neural Networks

Barry-Straume, Jostein, Verulkar, Adwait D., Sarshar, Arash, Popov, Andrey A., Sandu, Adrian

arXiv.org Artificial Intelligence | Oct-22-2025

The objective of designing a control system is to steer a dynamical system with a control signal, guiding it to exhibit the desired behavior. The Hamilton-Jacobi-Bellman (HJB) partial differential equation offers a framework for optimal control system design. However, numerical solutions to this equation are computationally intensive, and analytical solutions are frequently unavailable. Knowledge-guided machine learning methodologies, such as physics-informed neural networks (PINNs), offer new alternative approaches that can alleviate the difficulties of solving the HJB equation numerically. This work presents a multistage ensemble framework to learn the optimal cost-to-go, and subsequently the corresponding optimal control signal, through the HJB equation. Prior PINN-based approaches rely on stabilizing the HJB enforcement during training. Our framework does not use stabilizer terms and offers a means of controlling the nonlinear system, via either a singular learned control signal or an ensemble control signal policy. Success is demonstrated in closed-loop control, using both ensemble- and singular-control, of a steady-state time-invariant two-state continuous nonlinear system with an infinite time horizon, accounting for noisy, perturbed system states and varying initial conditions.
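How a control signal follows from a learned cost-to-go can be made explicit in a common special case. The derivation below assumes control-affine dynamics and a cost that is quadratic in the control; the abstract does not state the paper's cost functional, so $f$, $g$, $\ell$, and $R$ are illustrative symbols.

```latex
% Assumed setting (not from the source): control-affine dynamics,
% undiscounted infinite-horizon cost, R symmetric positive definite.
\begin{aligned}
\dot{x} &= f(x) + g(x)\,u, \qquad
J = \int_0^\infty \big(\ell(x) + u^{\top} R\, u\big)\,dt, \\
0 &= \min_{u}\left\{ \ell(x) + u^{\top} R\, u
  + \nabla V(x)^{\top}\big(f(x) + g(x)\,u\big) \right\}, \\
u^{*}(x) &= -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla V(x), \\
0 &= \ell(x) + \nabla V(x)^{\top} f(x)
  - \tfrac{1}{4}\, \nabla V(x)^{\top} g(x)\, R^{-1} g(x)^{\top} \nabla V(x).
\end{aligned}
```

The third line is the first-order condition of the minimization in the second, and substituting it back gives the last line. Under these assumptions, an approximation of the cost-to-go $V$ yields a closed-loop feedback law through its gradient.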

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.18195

Country: North America > United States (0.67)

Genre: Research Report (0.64)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Systems and Facilities > Geothermal System for Power Generation > Advanced Geothermal System (AGS) (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Continuous Q-Score Matching: Diffusion Guided Reinforcement Learning for Continuous-Time Control

Hua, Chengxiu, Gu, Jiawen, Tang, Yushun

arXiv.org Artificial Intelligence | Oct-21-2025

Reinforcement learning (RL) has achieved significant success across a wide range of domains; however, most existing methods are formulated in discrete time. In this work, we introduce a novel RL method for continuous-time control, where stochastic differential equations govern state-action dynamics. Departing from traditional value function-based approaches, our key contribution is the characterization of continuous-time Q-functions via a martingale condition and the linking of diffusion policy scores to the action gradient of a learned continuous Q-function by the dynamic programming principle. This insight motivates Continuous Q-Score Matching (CQSM), a score-based policy improvement algorithm. Notably, our method addresses a long-standing challenge in continuous-time RL: preserving the action-evaluation capability of Q-functions without relying on time discretization. We further provide theoretical closed-form solutions for linear-quadratic (LQ) control problems within our framework. Numerical results in simulated environments demonstrate the effectiveness of our proposed method and compare it to popular baselines.

machine learning, q-function, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2510.17122

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

SIMPOL Model for Solving Continuous-Time Heterogeneous Agent Problems

Salguero, Ricardo Alonzo Fernández

arXiv.org Artificial Intelligence | Sep-30-2025

This paper presents SIMPOL (Simplified Policy Iteration), a modular numerical framework for solving continuous-time heterogeneous agent models. The core economic problem, the optimization of consumption and savings under idiosyncratic uncertainty, is formulated as a coupled system of partial differential equations: a Hamilton-Jacobi-Bellman (HJB) equation for the agent's optimal policy and a Fokker-Planck-Kolmogorov (FPK) equation for the stationary wealth distribution. SIMPOL addresses this system using Howard's policy iteration with an *upwind* finite difference scheme that guarantees stability. A distinctive contribution is a novel consumption policy post-processing module that imposes regularity through smoothing and a projection onto an economically plausible slope band, improving convergence and model behavior. The robustness and accuracy of SIMPOL are validated through a set of integrated diagnostics, including verification of contraction in the Wasserstein-2 metric and comparison with the analytical solution of the Merton model in the no-volatility case. The framework is shown to be not only computationally efficient but also to produce solutions consistent with economic and mathematical theory, offering a reliable tool for research in quantitative macroeconomics.
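The idea behind an upwind scheme can be sketched in a few lines: the derivative of the value function is approximated with a forward difference where the state drifts upward and a backward difference where it drifts downward. The code below is a generic illustration on a uniform grid, not SIMPOL's implementation, and the function name and boundary handling are assumptions.

```python
def upwind_derivative(values, drift, dx):
    """Approximate dV/dx on a uniform grid, picking the stencil by drift sign.

    Forward difference where drift > 0, backward difference otherwise.
    At the grid edges only one neighbour exists, so this sketch uses it
    regardless of the drift sign; a real solver would impose boundary
    conditions there instead. Requires at least two grid points.
    """
    n = len(values)
    derivative = []
    for i in range(n):
        use_forward = drift[i] > 0
        if i == 0:
            use_forward = True
        elif i == n - 1:
            use_forward = False
        if use_forward:
            derivative.append((values[i + 1] - values[i]) / dx)
        else:
            derivative.append((values[i] - values[i - 1]) / dx)
    return derivative


if __name__ == "__main__":
    dx = 0.1
    grid = [i * dx for i in range(11)]
    values = [x * x for x in grid]          # V(x) = x^2, so V'(x) = 2x
    drift = [0.5 - x for x in grid]         # positive below 0.5, negative above
    print(upwind_derivative(values, drift, dx))
```

Choosing the stencil in the direction of the drift is what distinguishes this from a central difference, which uses both neighbours regardless of drift direction.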

artificial intelligence, equation, simpol, (18 more...)

arXiv.org Artificial Intelligence

2509.23557

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.71)

Gaussian process policy iteration with additive Schwarz acceleration for forward and inverse HJB and mean field game problems

Yang, Xianjin, Zhang, Jingguo

arXiv.org Artificial Intelligence | Sep-22-2025

We propose a Gaussian Process (GP)-based policy iteration framework for addressing both forward and inverse problems in Hamilton-Jacobi-Bellman (HJB) equations and mean field games (MFGs). Policy iteration is formulated as an alternating procedure between solving the value function under a fixed control policy and updating the policy based on the resulting value function. By exploiting the linear structure of GPs for function approximation, each policy evaluation step admits an explicit closed-form solution, eliminating the need for numerical optimization. To improve convergence, we incorporate the additive Schwarz acceleration as a preconditioning step following each policy update. Numerical experiments demonstrate the effectiveness of Schwarz acceleration in improving computational efficiency.
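The alternating structure of policy iteration is easiest to see on a problem where every step has a closed form. The sketch below swaps the paper's GP approximation for a scalar linear-quadratic problem, where policy evaluation reduces to a single linear equation; it includes no Schwarz acceleration, and all names and parameter values are illustrative assumptions.

```python
import math


def policy_iteration_lq(a, b, q, r, k0, iters=20):
    """Policy iteration for dx/dt = a*x + b*u with cost integral of q*x^2 + r*u^2.

    Policies are linear feedbacks u = -k * x, and the value function under
    a fixed policy is V(x) = p * x^2. The initial gain k0 must stabilize
    the system, i.e. a - b * k0 < 0.
    """
    k = k0
    p = None
    for _ in range(iters):
        closed_loop = a - b * k
        if closed_loop >= 0:
            raise ValueError("policy must be stabilizing")
        # Policy evaluation: solve 0 = q + r*k^2 + 2*p*(a - b*k) for p.
        p = (q + r * k * k) / (-2.0 * closed_loop)
        # Policy improvement: minimize r*u^2 + 2*p*x*b*u over u.
        k = b * p / r
    return p, k


def riccati_lq(a, b, q, r):
    """Positive root of the scalar Riccati equation q + 2*a*p - b^2*p^2/r = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p, k = policy_iteration_lq(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0)
    print(p, k, riccati_lq(1.0, 1.0, 1.0, 1.0))
```

In this setting the iterates approach the Riccati solution of the corresponding HJB equation, so the closed form serves as a check on the iteration.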

artificial intelligence, inverse problem, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2505.00909

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
